@Shahab96 Shahab96 commented Nov 8, 2025

This commit adds advanced multi-pool scheduling capabilities and fixes critical bugs discovered through comprehensive analysis of the RustFS source code (~/git/rustfs).

Critical Bug Fixes

Verified against RustFS source code (crates/config/src/constants/app.rs):

  1. Fix console port: 9090 → 9001

    • RustFS DEFAULT_CONSOLE_ADDRESS is ":9001", not 9090
    • Affects: services.rs, workloads.rs
  2. Fix IO service port: 90 → 9000

    • S3 API standard port is 9000
    • RustFS DEFAULT_ADDRESS is ":9000"
    • Affects: services.rs
  3. Add required RustFS environment variables:

    • RUSTFS_ADDRESS=0.0.0.0:9000
    • RUSTFS_CONSOLE_ADDRESS=0.0.0.0:9001
    • RUSTFS_CONSOLE_ENABLE=true
    • Without these, RustFS containers fail to start properly
    • Verified from RustFS docker-compose.yml and Helm chart
  4. Standardize volume paths to RustFS convention:

    • Before: /data/{N} (custom)
    • After: /data/rustfs{N} (RustFS standard)
    • RUSTFS_VOLUMES: .../data/rustfs{0...N}
    • Matches RustFS Helm chart, MNMD examples, docker-compose
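
The four fixes above can be summarised in a short Rust sketch. The constant and function names here (`IO_PORT`, `CONSOLE_PORT`, `rustfs_env`, `volume_mount_path`) are hypothetical and only illustrate the corrected values; they are not taken from the operator's `services.rs` or `workloads.rs`.

```rust
// Illustrative sketch only: all names are assumptions, not the operator's real API.

/// S3 API port (RustFS DEFAULT_ADDRESS is ":9000").
const IO_PORT: u16 = 9000;
/// Console port (RustFS DEFAULT_CONSOLE_ADDRESS is ":9001").
const CONSOLE_PORT: u16 = 9001;

/// Environment variables that the PR adds to the RustFS container.
fn rustfs_env() -> Vec<(String, String)> {
    vec![
        ("RUSTFS_ADDRESS".to_string(), format!("0.0.0.0:{}", IO_PORT)),
        (
            "RUSTFS_CONSOLE_ADDRESS".to_string(),
            format!("0.0.0.0:{}", CONSOLE_PORT),
        ),
        ("RUSTFS_CONSOLE_ENABLE".to_string(), "true".to_string()),
    ]
}

/// Mount path for the volume at `index`, following the /data/rustfs{N} convention.
fn volume_mount_path(index: u32) -> String {
    format!("/data/rustfs{}", index)
}
```

Calling `volume_mount_path(0)` yields `/data/rustfs0`, whereas the previous custom layout would have produced `/data/0`.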

Multi-Pool Scheduling Enhancements

Added comprehensive Kubernetes scheduling capabilities per pool:

  1. Created SchedulingConfig struct:

    • nodeSelector - Target specific nodes by labels
    • affinity - Complex node/pod affinity rules
    • tolerations - Schedule on tainted nodes
    • topologySpreadConstraints - Distribute across failure domains
    • resources - CPU/memory requests and limits
    • priorityClassName - Override tenant-level priority
  2. Uses #[serde(flatten)] for clean code organization:

    • Groups related scheduling fields
    • Maintains flat YAML structure (backward compatible)
    • Follows industry pattern (MongoDB, PostgreSQL operators)
  3. Enables advanced deployment patterns:

    • Hardware targeting (nodeSelector for specific hardware types)
    • Geographic distribution (affinity for regions/zones)
    • Spot instance optimization (tolerations for spot taints)
    • High availability (topology spread across zones)
    • Resource differentiation (different CPU/memory per pool)

Implementation Details

  • Pool-level scheduling fields applied to StatefulSet PodSpec
  • Pool-level resources applied to Container
  • Pool-level priority class overrides tenant-level with fallback
  • All fields optional (100% backward compatible)
  • Re-exported SchedulingConfig from v1alpha1 module
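
As a rough illustration of the optional fields and the priority-class fallback described above, here is a minimal std-only sketch. It is a simplified stand-in: the real `SchedulingConfig` carries Kubernetes API types and serde attributes, and the function name `effective_priority_class` is an assumption made for this example.

```rust
use std::collections::BTreeMap;

/// Simplified stand-in for the per-pool scheduling options (assumption:
/// only two of the fields are modelled, using plain std types).
#[derive(Debug, Default, Clone)]
struct SchedulingConfig {
    node_selector: Option<BTreeMap<String, String>>,
    priority_class_name: Option<String>,
}

/// The pool-level priority class wins when set; otherwise the
/// tenant-level value (if any) is used as the fallback.
fn effective_priority_class(
    pool: &SchedulingConfig,
    tenant_priority: Option<&str>,
) -> Option<String> {
    pool.priority_class_name
        .clone()
        .or_else(|| tenant_priority.map(str::to_string))
}
```

Because every field is an `Option` and the struct derives `Default`, a pool that sets no scheduling fields behaves exactly as it did before the change.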

Testing

  • Added 5 new tests for scheduling field propagation
  • All tests passing (25/25)
  • Verified node selector, tolerations, priority, resources

Breaking Changes

None. All new fields are Option<T>, so existing Tenants work unchanged.

Verification

All changes verified against:

  • RustFS source: ~/git/rustfs/rustfs/src/config/mod.rs
  • RustFS constants: ~/git/rustfs/crates/config/src/constants/app.rs
  • RustFS Helm chart: ~/git/rustfs/helm/rustfs/
  • RustFS MNMD example: ~/git/rustfs/docs/examples/mnmd/

🤖 Generated with Claude Code

@Shahab96 Shahab96 requested a review from bestgopher as a code owner November 8, 2025 13:29
Shahab96 and others added 2 commits November 8, 2025 18:43
@Shahab96 Shahab96 force-pushed the feature/pool-scheduling-enhancements branch from 276811c to 736297d Compare November 8, 2025 13:43
Shahab96 commented Nov 8, 2025

Latest available image used for testing: docker pull ghcr.io/shahab96/operator:7e12cb52f7b4047feeb7f0949af422dfbb83ec95

@Shahab96 Shahab96 marked this pull request as draft November 8, 2025 13:52
Shahab96 and others added 2 commits November 8, 2025 18:54
The operator was panicking at runtime with:
  "Could not automatically determine the process-level CryptoProvider from Rustls crate features"

This occurred because kube uses hyper-rustls which uses rustls, but the
rustls crypto backend (ring) wasn't being properly configured through the
dependency chain.

Solution:
- Enable rustls-tls feature on kube crate
- This ensures kube's rustls dependency gets the ring crypto provider
- The existing rustls = {version="0.23", features = ["ring"]} alone
  wasn't sufficient because it doesn't affect kube's dependency tree

The rustls-tls feature on kube properly configures:
- hyper-rustls with rustls backend
- rustls with ring crypto provider
- TLS support for Kubernetes API client

This is the standard solution for kube-rs applications.

References:
- https://github.com/kube-rs/kube/blob/main/kube-client/Cargo.toml
- https://docs.rs/kube/latest/kube/#features

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
@Shahab96 Shahab96 marked this pull request as ready for review November 8, 2025 14:01
@bestgopher bestgopher merged commit cae69b0 into rustfs:main Nov 9, 2025
2 checks passed